========================================================
CNN TRANSCRIPTS DATASET (2000-2025)
========================================================

Repository: https://github.com/notnews/cnn_transcripts

NOTICE: For copyright reasons, access is restricted to research use only.

DATASET DESCRIPTION:
--------------------
This dataset contains CNN transcripts from 2000/01/01 to 2025/03/15.
The transcripts are split across eight CSV files by date range.

FILE STRUCTURE:
---------------
cnn-1.csv - Data from 2000/01/01 to 2000/04/20 - 7,017 transcripts
cnn-2.csv - Data from 2000/04/21 to 2001/04/03 - 21,381 transcripts
cnn-3.csv - Data from 2001/04/04 to 2002/08/06 - 35,269 transcripts
cnn-4.csv - Data from 2002/08/07 to 2002/09/16 - 2,343 transcripts
cnn-5.csv - Data from 2002/09/17 to 2012/05/18 - 101,336 transcripts
cnn-6.csv - Data from 2012/05/19 to 2014/06/17 - 23,536 transcripts
cnn-7.csv - Data from 2014/06/18 to 2022/02/05 - 102,458 transcripts
cnn-8.csv - Data from 2022/02/01 to 2025/03/15 - 43,562 transcripts

TOTAL TRANSCRIPTS: 336,902
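
After downloading, the per-file counts listed above can be checked with a
short script. The following is a minimal sketch, not part of the repository:
it assumes each file is UTF-8 encoded, has a single header row, and stores one
transcript per CSV record. The function names are hypothetical. Note that the
listed ranges for cnn-7.csv and cnn-8.csv overlap (2022/02/01 to 2022/02/05),
so checking for duplicate transcripts in that window may be worthwhile.

```python
import csv
from pathlib import Path

# Transcript counts as listed in the FILE STRUCTURE section of this README.
EXPECTED_COUNTS = {
    "cnn-1.csv": 7017,
    "cnn-2.csv": 21381,
    "cnn-3.csv": 35269,
    "cnn-4.csv": 2343,
    "cnn-5.csv": 101336,
    "cnn-6.csv": 23536,
    "cnn-7.csv": 102458,
    "cnn-8.csv": 43562,
}

# Transcript text can be long, so raise the csv module's per-field size limit.
csv.field_size_limit(2**31 - 1)


def count_rows(path):
    """Count CSV records in a file, excluding one header row (assumed)."""
    with open(path, newline="", encoding="utf-8") as f:  # encoding is assumed
        reader = csv.reader(f)
        next(reader, None)  # skip the assumed header row
        return sum(1 for _ in reader)


def check_counts(data_dir="."):
    """Return {filename: (expected, actual)} for files whose counts differ.

    Files that are not present in data_dir are skipped.
    """
    mismatches = {}
    for name, expected in EXPECTED_COUNTS.items():
        path = Path(data_dir) / name
        if not path.exists():
            continue
        actual = count_rows(path)
        if actual != expected:
            mismatches[name] = (expected, actual)
    return mismatches


if __name__ == "__main__":
    for name, (expected, actual) in check_counts(".").items():
        print(f"{name}: expected {expected}, found {actual}")
```

The csv module is used instead of counting lines because transcript text
may contain embedded newlines inside quoted fields.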

RELATED DATASETS:
-----------------
1. CNN Transcripts 2000-2025
   https://doi.org/10.7910/DVN/ISDPJU

2. Top News: Story URLs and Text from News Feeds of Major National News Sites (2022 to 03/2025)
   https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/ZNAKK6

3. MSNBC Transcripts: 2003-2022
   https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/UPJDE1

4. Fox News Transcripts (2003-2025)
   https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/Q2KIES

5. Closed Caption News Transcripts from the Internet Archive (2014-2023)
   https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/OAJJHI

USAGE NOTES:
------------
- The dataset is provided for research purposes only; please respect copyright restrictions when using it
- Citation information can be found in the repository